(54) Electronic document processing systems.

(57) Provision is made in electronic document processing systems for printing unfiltered or filtered 
machine-readable digital representations of electronic documents, and human-readable renderings of 
them on the same record medium using the same printing process. The integration of machine-readable 
digital representations of electronic documents with the human-readable hardcopy renderings of them 
may be employed, for example, not only to enhance the precision with which the structure and content 
of such electronic documents can be recovered by scanning such hardcopies into electronic document 
processing systems, but also as a mechanism for enabling recipients of scanned-in versions of such 
documents to identify and process annotations that were added to the hardcopies after they were 
printed and/or for alerting the recipients of the scanned-in documents to alterations that may have been 
made to the original human-readable content of the hardcopy renderings. 

In addition to storage of the electronic representation of the document, provision is made for 
encoding information about the electronic representation of the document itself, such as file name, 
creation and modification dates, access and security information, and printing histories. Provision is also
made for encoding information which is computed from the content of the document and other 
information, for purposes of authentication and verification of document integrity. Provision is also 
made for the encoding of information which relates to operations which are to be performed depending 
on handwritten marks made upon a hardcopy rendering of the document; for example, encoding 
instructions of what action is to be taken when a box on a document is checked. Provision is also made 
for encoding in the hardcopy another class of information: information about the rendering of the
document specific to that hardcopy, which can include a numbered copy of that print, the identification
of the machine which performed that print, the reproduction characteristics of the printer, and the screen
frequency and rotation used by the printer in rendering halftones. Provision is also made for encoding
information about the digital encoding mechanism itself, such as information given in standard- 
encoded headers about subsequently compressed or encrypted digital information. 
This invention relates to electronic document processing
systems and, more particularly, to methods
and means for more tightly coupling the usual 
hardcopy output of such systems to the electronic 
documents from which the human readable 
hardcopies are produced. The coupling afforded by 
this invention may be sufficiently tight to enable prin- 
ted, human readable hardcopy documents to be 
employed as an essentially lossless medium for stor- 
ing and transferring digital electronic documents. 
Alternatively, such coupling may be utilized to capture 
otherwise unavailable or not easily discernible infor- 
mation relevant to the reproduction of the electronic 
source document.

Modern electronic document processing systems 
generally include input scanners for electronically 
capturing the general appearance (i.e., the human
readable information content and the basic graphical 
layout) of human readable hardcopy documents; pro- 
grammed computers for enabling users to create, edit 
and otherwise manipulate electronic documents; and 
printers for producing hardcopy, human readable ren- 
derings of electronic documents. These systems typi- 
cally have convenient access to mass memory for the 
storage and retrieval of electronic document files. 
Moreover, they often are networked by local area networks
(LANs), switched data links, and the like for
facilitating the interchange of digital electronic docu- 
ments and for providing multi-user access to shared 
system resources, such as high speed electronic prin- 
ters and electronic file servers. 

The technical details pertaining to the 
interchangeability of electronic documents are 
beyond the scope of this invention, but it should be 
understood that there is not yet a "universal
interchange standard" for losslessly interchanging
"structured electronic documents" (i.e., documents
conforming to predefined rules governing their con- 
stituent elements, the characteristics of those ele- 
ments, and the interrelationships among their 
elements). Plain text ASCII encoding is becoming a 
de facto interchange standard, but it is of limited utility 
for representing structured electronic documents. 
Other encoding formats provide fuller structural rep- 
resentations of electronic documents, but they usually 
are relatively system specific. For example, some of 
the more basic document description languages 
(DDLs) employ embedded control codes for supple- 
menting ASCII encodings with variables defining the 
logical structure (i.e., the sections, paragraphs, sentences,
figures, figure captions, etc.) of electronic
documents, thereby permitting such documents to be 
formatted in accordance with selected formatting vari- 
ables, such as selected font styles, font sizes, line and 
paragraph spacings, margins, indentations, header 
and footer locations, and columns. Graphical DDL 
encodings provide more sophisticated and complete 
representations of electronic document structures
because they encode both the logical structure and
the layout structure of such documents. Page description
language (PDL) encodings are related to graphical
DDL encodings, but they are designed so that they
can be readily decomposed or interpreted to define
the detailed layout of the printed page in a raster scan 
format. Accordingly, it will be appreciated that the 
transportability of electronic documents from one 
document processing system to another depends
upon the ability of the receiving or "target" system to
interpret, either directly or through the use of a format 
converter, the encoding format in which the document 
is provided by the originating or "source" system. To 
simplify this disclosure, source/target encoding format
compatibility will be assumed, but it should be
clearly understood that this is a simplifying assump- 
tion. 

Others previously have proposed printing digital 
data, including electronic document files, on a record
medium, such as plain paper, so that optical readers
can be employed for uploading the data into electronic
document processing systems. See, for example, US-A-4,754,127
and US-A-4,782,221. In view of the
additional insights provided by the user documentation
for "The Laser Archivist," it is believed that the
so-called "data strips" this prior work has provided are 
printed as physically distinct entities. Accordingly, the 
user can use a standard "cut and paste" process for 
attaching such data strips, if desired, to the human
readable renderings of the files to which they pertain.
In this system, the scanner used to read the printed 
data strips is not a general-purpose document scanner,
but rather, a special-purpose hand-held computer
peripheral optimized for reading the data strips,
as disclosed in US-A-4,692,603; this system could not
be said to close the loop between common document 
production and reprographic equipment, as the present
invention intends. US-A-4,665,004 also is interesting
because it proposes using a specialized optical
recording system and record medium for optically
recording the raw digital data for a computer-gener- 
ated pictorial image in a form that permits the raw data 
(including digitized versions of any optional written or 
oral annotations) to be physically secured to the
human readable, hardcopy rendering of the image.
However, that approach has the drawback of requir- 
ing the use of different recording mechanisms for pro- 
ducing the machine-readable digital data 
representation and the human-readable rendering.
Moreover, the digital data is not recorded in a form
that permits it to be readily copied using ordinary 
office equipment. 

US-A-4,728,984 is believed to be especially 
noteworthy because it relates to the use of an electronic
printer for recording digital data on plain paper,
together with the use of an input scanner for scanning 
digital data that has been recorded on such a record 
medium to upload the data into the internal computer 
of the printer. The '984 patent discusses several subjects
which are meaningful to the present invention,
including the redundant recording of digital information,
the archival storage and distribution of digital
data recorded on plain paper, the compression that
can be achieved by digitally recording text and
graphics, and the data security that can be achieved by
encrypting digitally recorded text and graphics.
Moreover, it discloses a typical printer and a typical
input scanner in substantial detail.

Paper documents still are a primary medium for 
written communications and for record keeping. They 
can be replicated easily by photocopying, they can be 
distributed and filed in original or photocopied form, 
and facsimiles of them can be transmitted to remote
locations over the public switched telephone network. 
Paper and other hardcopy documents are so perva- 
sive that they are not only a common output product 
of electronic document processing systems, but also 
an important source of input data for such systems.

In recognition of the fundamental role human- 
readable hardcopy documents play in modern soci- 
ety, input scanners have been developed for 
uploading them into electronic document processing 
systems. These scanners typically convert the
appearance of the hardcopy into a raster formatted, 
digital data stream, thereby providing a bit-mapped 
representation of the hardcopy appearance. How- 
ever, bit maps require relatively large amounts of 
memory and are difficult to edit and manipulate, so
substantial effort and expense have been devoted to 
the development of recognition processes for convert- 
ing bit-mapped document appearances into corre- 
sponding symbolic encodings. Unfortunately, 
recognition processes generally are inferential and of
limited scope, so they have difficulty correlating 
unusual bit map patterns with corresponding encod- 
ings, and they are prone to making inference errors 
even when they determine that a correlation exists. 
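
By way of a rough illustration of the memory burden mentioned above, the following sketch compares the uncompressed size of a bit-mapped page with the size of a plain character encoding of a page of text. The page size (8.5 by 11 inches), the scan resolution (300 spots per inch), the one-bit pixel depth and the 60-line by 80-character text page are all assumed figures chosen for the example; none of them is taken from this disclosure.

```python
def bitmap_bytes(width_in, height_in, spots_per_inch, bits_per_pixel=1):
    """Uncompressed raster size, in bytes, of a page scanned at the given resolution."""
    width_px = round(width_in * spots_per_inch)
    height_px = round(height_in * spots_per_inch)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Assumed example: a bilevel scan of a letter-size page at 300 spots per inch.
page_bitmap = bitmap_bytes(8.5, 11, 300)

# Assumed example: 60 lines of 80 characters, one byte per character.
page_text = 60 * 80

ratio = page_bitmap / page_text
print(page_bitmap, page_text, round(ratio))
```

Under these assumptions the bit map occupies a little over one megabyte, roughly two hundred times the size of the character encoding of the same page of text.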

Turning for a moment to the conventional
hardcopy output of electronic document processing 
systems, it will be evident that a hardcopy rendering 
of an electronic document often is only a partial rep- 
resentation of the content of the corresponding electronic
document file. The appearance of a hardcopy
rendering is governed by the structure and content of 
the electronic document to which it pertains, but the 
digital data encodings which define the structure and 
content of the electronic document are not explicitly 
embodied by the rendering. So-called "intelligent"
input scanners (scanners equipped with substantial 
image-processing software) having sufficient 
knowledge of the structural encoding rules theoreti- 
cally can recover the structural encodings for at least 
some types of electronic documents from hardcopy
renderings of them, but the practical results frequently 
do not conform to the theoretical expectations, especially
if the hardcopy is distorted (such as by a photocopying
or facsimile process), damaged or altered
prior to being scanned. 

Furthermore, some types of electronic document 
data are virtually impossible to infer from a hardcopy 
rendering. For example, electronic spreadsheets con- 
ventionally include computational algorithms for defi- 
ning the computations which are required to compute 
the spreadsheet, but these algorithms generally are
not explicitly set forth in the hardcopy rendering of the 
computed spreadsheet. Likewise, electronic hyper- 
text documents and multimedia documents ordinarily 
contain pointers which link them to related electronic 
documents, but the links provided by those pointers 
usually are not embodied in the hardcopy renderings 
of such documents. Still another example is provided 
by computer-generated synthetic graphical images, 
where the control points for the graphical objects that 
form the image and the data defining the curves which 
fit those control points normally can only be approxim- 
ated from a hardcopy rendering of such an image. As 
still another example, it will be understood that prints 
generated by computer-aided design (CAD) systems 
typically are approximate representations of the high 
precision data of the underlying electronic file, which 
often contains three-dimensional information. As a 
general rule, the mathematical models and the related 
data from which such a system generates such prints 
are not fully recoverable from a hardcopy rendering representing
any single view. As a further example, it is
to be understood that the color values for objects 
(such as the cyan, magenta, yellow and black values 
for printed four-color images) also are difficult to 
ascertain with any substantial certainty from a 
hardcopy color rendering, and would be impossible to 
recover from a black-and-white copy of that color document
hardcopy. There are times when documents are
printed in black and white as a result of the limited 
capabilities of the available printer, even though the 
original electronic source document might have been 
intended to provide a full color, a functional color, or 
a highlight color representation. Indeed, even some of
the more fundamental attributes of electronic docu- 
ments, such as their file names, author, creation date, 
etc., are seldom found in the hardcopy renderings of 
such documents. 

Consequently, it will be evident that it would be a 
significant improvement if the ordinary hardcopy out- 
put of electronic document processing systems could 
be employed as an essentially lossless medium for 
storing all or part of the structure and content of elec- 
tronic documents, and for transferring that data from 
the printer of one electronic document processing 
system to the input scanner of the same or another 
document processing system. Hardcopy documents 
of that type would not only continue to function as a 
convenient medium for distributing and storing hu- 
man-readable renderings of electronic documents, 
but also would provide a convenient alternative to the 
digital mass memories which customarily are used for
storing electronic documents, and to the digital data 
links and removable digital recording media which 
normally are employed for transferring electronic 
documents from one location to another. Furthermore,
the integration of machine-readable digital representations
of electronic documents with
human-readable renderings of them would permit 
various combinations of human and computer information
processing steps to be employed for processing
information more easily and quickly.

Therefore, in accordance with the present inven- 
tion, provision is made in electronic document proces- 
sing systems for printing unfiltered or filtered (i.e., 
complete or partial, uncompressed or compressed)
machine-readable digital representations of elec- 
tronic documents and human-readable renderings of 
them on the same record medium using the same 
printing process. The integration of machine-readable 
digital representations of electronic documents with
the human-readable hardcopy renderings of them 
may be employed, for example, not only to enhance 
the precision with which the structure and content of 
such electronic documents can be recovered by scanning
such hardcopies into electronic document processing
systems, but also as a mechanism for
enabling recipients of scanned-in versions of such 
documents to identify and process annotations that 
were added to the hardcopies after they were printed, 
and/or for alerting the recipients of the scanned-in
documents to alterations that may have been made to 
the original human-readable content of the hardcopy 
renderings. 
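
The alerting function described above can be pictured as a comparison between two versions of the same text: the text recovered from the machine-readable representation and the text recognized from the human-readable rendering. The following is a minimal sketch only, assuming that both versions are already available as plain character strings; the function name find_alterations and the sample strings are hypothetical, and the word-level comparison from the Python standard library is a stand-in for whatever comparison a practical system would employ.

```python
import difflib


def find_alterations(embedded_text, recognized_text):
    """Return (operation, original words, scanned words) for each difference.

    embedded_text   -- text recovered from the machine-readable representation
    recognized_text -- text recognized from the human-readable rendering
    """
    original = embedded_text.split()
    scanned = recognized_text.split()
    matcher = difflib.SequenceMatcher(a=original, b=scanned, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, " ".join(original[i1:i2]), " ".join(scanned[j1:j2])))
    return changes


# Hypothetical sample: one word was altered and one word was added after printing.
embedded = "Pay the bearer 100 dollars on demand"
scanned = "Pay the bearer 900 dollars on demand today"
changes = find_alterations(embedded, scanned)
for change in changes:
    print(change)
```

In this sketch a "replace" entry corresponds to an alteration of the original content, while an "insert" entry corresponds to material, such as an annotation, that was added to the hardcopy after it was printed.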

In addition to storage of a complete or partial electronic
representation of the document and/or its content,
this invention may be utilized for encoding
information about the electronic representation of the 
document itself, such as file name, creation and modification
dates, access and security information, and printing
histories. Provision may also be made for
encoding information which is computed from the con- 
tent of the document and other information, for pur- 
poses of authentication and verification of document 
integrity and for computational purposes, such as the 
recomputation of a spreadsheet. Furthermore, provision
may be made for the encoding of information
which relates to operations which are to be performed 
depending on handwritten marks made upon a 
hardcopy rendering of the document; for example, 
instructions controlling the action which is to be taken
when a box on a document is checked. Still further, 
this invention may be employed for encoding in the 
hardcopy another class of information: information 
about the rendering of the document specific to a 
single, given hardcopy, which can include a numbered
copy of that print, the identification of the
machine which performed that print, the reproduction 
characteristics of the printer, the screen frequency
and rotation used by the printer in rendering halftones,
and the identity or characteristics of the print medium 
and marking agents (such as the paper and 
xerographic toner, respectively). Moreover, provision
also may be made for encoding information about the 
digital encoding mechanism itself, such as infor- 
mation given in standard-encoded headers about 
subsequently compressed or encrypted digital infor- 
mation. 
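
The kinds of supplemental information enumerated above can be illustrated by a small descriptive record that accompanies the document data. The sketch below is offered only as an example under stated assumptions: the field names, the JSON serialization and the SHA-256 digest are choices made for the illustration and are not prescribed by this disclosure, and the file name and dates shown are hypothetical.

```python
import hashlib
import json


def build_header(file_name, created, modified, content):
    """Describe a document and attach a digest computed from its content."""
    return {
        "file_name": file_name,
        "created": created,
        "modified": modified,
        "length": len(content),
        # Assumed digest algorithm; any agreed-upon function would serve.
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def serialize_header(header):
    """Serialize deterministically so equal headers always yield equal bytes."""
    return json.dumps(header, sort_keys=True, separators=(",", ":")).encode("ascii")


def verify(header, recovered_content):
    """Return True when recovered content matches the recorded length and digest."""
    return (
        len(recovered_content) == header["length"]
        and hashlib.sha256(recovered_content).hexdigest() == header["sha256"]
    )


# Hypothetical document and attributes.
header = build_header("memo.txt", "1990-07-31", "1990-08-02", b"Quarterly totals")
print(serialize_header(header))
print(verify(header, b"Quarterly totals"))
```

A recipient that recovers both the record and the content can recompute the digest and compare it with the recorded value, which is one simple way of checking document integrity.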

When the electronic document includes a scan- 
ned-in image, this invention may be utilized for sup- 
plementing the hardcopy rendering of such a 
document with embedded data characterizing the
input scanner and the scan process responsible for 
inputting the image. Similarly, when a hardcopy is 
reproduced by a light-lens or electronic copier or a
facsimile system, data characterizing the reproduc- 
tion equipment and process can be embedded in the 
hardcopy reproduction. 

Still another possible application for the present 
invention relates to augmentation of hardcopy render- 
ings with data defining various active and passive 
user aids which exist in the electronic document 
domain. For example, electronic buttons, soft keys, 
drawing brushes, magnifying tools, phone tools and 
document feed arrows could be transferred in this 
way. 

As will be appreciated, the supplemental data 
may be embedded in the hardcopy renderings in a 
variety of ways. For example, it may be organized 
hierarchically to ensure the inclusion and robust sur- 
vival of the more important information. Some or all of 
the data may be redundantly recorded on the 
hardcopy renderings to increase their likelihood of 
surviving copying and handling. Moreover, the redun- 
dantly-recorded data may aid in recovering lower 
priority, non-redundantly recorded data from the hu- 
man-readable content of the rendering, or the 
hardcopy recorded data may include pointers to sour- 
ces of backup data, should a backup source be 
required. 
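
As a simple illustration of how redundant recording can improve the likelihood that data survives copying and handling, the following sketch records several identical copies of a short data field and recovers it by taking, at each byte position, the value on which most copies agree. The choice of three copies and of byte-wise majority voting is an assumption made for the example; this disclosure does not prescribe a particular redundancy scheme, and the function names are hypothetical.

```python
from collections import Counter


def record_redundantly(data, copies=3):
    """Return the identical copies of the data that would be printed."""
    return [bytes(data) for _ in range(copies)]


def recover(copies):
    """Rebuild the data by taking the most common byte at each position."""
    length = min(len(copy) for copy in copies)
    return bytes(
        Counter(copy[i] for copy in copies).most_common(1)[0][0]
        for i in range(length)
    )


# Hypothetical example: two of the three copies are damaged, in different places.
printed = record_redundantly(b"file: memo")
damaged = [b"file: mema", b"fiXe: memo", printed[2]]
print(recover(damaged))
```

Because the damage falls at different positions in the different copies, two copies still agree at every position and the original field is recovered.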

The present invention will now be described by 
way of example with reference to the accompanying 
drawings, in which: 

Figure 1 is a functional schematic diagram of a
relatively fully featured, known electronic document
processing system;
Figure 2 is another functional schematic diagram
illustrating certain of the enhancements this
invention provides for electronic document processing
systems of the same general type as
shown in Fig. 1;
Figures 3 and 4 depict digitally augmented documents
produced in accordance with this invention; and
Figure 5 illustrates some of the document processing
applications and work-ways which are facilitated
by this invention.
Turning now to the drawings, and at this point
especially to Fig. 1, existing electronic document processing
systems 11 typically include (i) an input scanner
12 for inputting or "uploading" human-readable
hardcopy documents 13 into the system, (ii) a prog- 
rammed computer 14, such as a personal computer 
or a workstation, for creating, editing and manipulat- 
ing digital electronic documents, and (iii) a bitmap 
printer 15 and/or a dot matrix or fully formed character
printer 16 for outputting or "downloading" human- 
readable hardcopy renderings of electronic docu- 
ments from the system. 

There are a wide variety of known input devices 
which a user may employ for creating, editing and 
manipulating electronic documents. For example, a 
keyboard 21 ordinarily is provided for inputting 
typographic data, generally together with a predeter- 
mined set of control codes. Additionally, a pointing 
device, such as a mouse 22, commonly is utilized for 
controlling the positioning of a cursor on a monitor (not 
shown) that provides the visual feedback which 
assists the user to interact with the computer 14 effec- 
tively. Modern user interfaces, such as the graphical 
user interfaces that are becoming increasingly popu- 
lar for personal computers and workstations, often 
extend the functionality of the mouse-like pointer 22 
so that it can be employed, together with a few keys- 
trokes on the keyboard 21, to input a relatively rich 
and easily extensible set of control codes. There are 
still other input devices 24, such as stylus-sensitive 
digitizing pads, voice digitizers and video digitizers 
(not shown), which may be utilized for inputting handwritten
data (e.g., free-hand sketches, signatures,
etc.), voice annotations and video data into the docu- 
ment processing system 11. Furthermore, as des- 
cribed in some additional detail hereinbelow, the input 
scanner 12 is available for inputting hardcopy docu- 
ments, including hardcopy output from the document 
processing system 11 and from other electronic document
processing systems (not shown), as well as
hardcopy documents created manually and by other 
types of marking mechanisms, such as standard 
typewriters. 

Document assembly software 31 residing in the 
computer 14 interprets the input data and the control 
codes that are fed into the computer 14 to produce 
structured electronic documents 32. Each of these 
electronic documents typically is identified by a locally 
unique file name 33 which may be assigned to the 
electronic document 32 by the user, as shown, or by 
the computer 14 under program control. Typically, the 
document assembly software 31 is application speci- 
fic, but the lines between different applications are 
becoming blurred with the emergence of integrated 
multi-function software, such as the Xerox Viewpoint
environment. For example, in the case of text entered 
via the keyboard 21, the ASCII encodings 35 of the 
typographic characters are combined in the document
assembly software 31 with control codes to provide 
DDL encodings for insertion into a structured text file 
(or, in the case of an electronic document which permits
of mixed data types, into a text frame) 32. A significant
portion of the logical structure of the electronic
document 32 usually is explicitly defined by its compo- 
sition, without requiring any additional intervention by 
the user. However, provision normally is made for 
enabling the user to enter document formatting commands,
as at 36 and 37, to override the default values
which the document assembly software 31 otherwise 
would employ for defining the layout structure of the 
document 32. 

As is known, structured electronic documents 32
can be interchanged between DDL compatible electronic
document processing systems, as at 41,
through the use of removable digital record media, 
such as floppy disks and the like, and through the use 
of digital data links. Furthermore, networked document
processing systems typically are able to
interchange electronic documents, either directly by 
means of a direct file transfer protocol or electronic 
mail as at 42, or indirectly by means of shared elec- 
tronic file servers 43. 

Hardcopy renderings 45 of locally or remotely
produced structured electronic documents 32 can be 
printed from a DDL encoding by employing, for 
example, a suitable print driver for driving a standard 
character printer 45. Alternatively, a PDL encoding of
the document 32 may be composed, as at 46, to provide
a PDL master 47 which, in turn, can be decomposed,
as at 48, to provide an electronic bitmap
representation 49 of the document 32 for printing by 
a bitmap printer 50. PDL masters, such as the master
47, also are structured electronic documents which
can be interchanged among PDL-compatible elec- 
tronic document processing systems by means of 
physically removable record media 41, direct file 
transfer protocols/electronic mail 42, and shared file
servers 43.

Like any other hardcopy document, the hardcopy 
rendering 45 of an electronic document 32 may be 
photocopied by a light-lens copier, as at 53, or by a
digital copier, as at 54. Additionally, a copy of the rendering
45 may be transmitted to or received from a
remote location via facsimile. Standard photocopying 
and facsimile processes tend to cause some distor- 
tion of the image, so the copies they produce often are 
somewhat degraded, especially when the copies are
several copy generations removed from the original
rendering 45. 

As will be understood, the hardcopy input 61 for 
the input scanner 12 may be the original or a copy of 
the rendering 45 or of a similar hardcopy rendering
from another electronic document processing system
(not shown). Alternatively, the input document 61 may 
be the original or a copy of a document created manually
or through the use of a mechanical or electromechanical
marking mechanism, such as a standard
typewriter and the like. Additionally, the original
human readable information content of the document 
61 might be supplemented by various annotations 
and editorial markings. Also, changes may have been
made to the original human-readable information con- 
tent of the document 61, with or without any intent to 
deceive. 

In accordance with standard practices, to capture 
the human-readable information content of the document
61 electronically, the input scanner 12 first converts
the appearance or image of the document 61
into an electronic bitmap 62. Recognition software 63 
then usually is employed for converting the bitmap 
representation 62 into elemental textual and graphical
encodings to the extent that the recognition software 
63 is able to establish a correlation between elements 
of the bitmap image 62 and the features it is able to 
recognize. For example, state-of-the-art recognition 
software 63 generally can correlate printed
typographic characters with their ASCII encodings, as 
at 64, with substantial success. Additionally, the rec- 
ognition software 63 sometimes is able to perform 
some or all of the following tasks: (a) infer some or all 
of the page-layout features of the document 61 from
its bit map representation 62, thereby establishing a 
basis for supplying page-layout control codes as at 
65, (b) make probabilistic (e.g., "nearest-fit") determinations
with respect to the font or fonts used to print
text appearing in the document 61, thereby providing
a foundation for supplying font control codes as at 66, 
and (c) fully or partially decompose line drawings
appearing in the document 61 into "best-guess" vectors,
thereby providing a basis for supplying corresponding
vector encodings as at 67. However, even
with these various recognition tools, the recognition 
software 63 often is unable to recognize some of the 
features of the document 61, so it usually also 
includes provision for inserting the bit maps for unrecognized
images into image frames. Therefore, the
electronic representation of the document 61 that the 
document processing system 11 receives from its 
input scanner 12 typically is composed of probabilistic
encodings, bit map images, or some combination of 
those two. Moreover, the input scanner 12 has no
mechanism for recovering data relating to the docu- 
ment 61 beyond whatever is inferrable from its 
appearance. 

In Fig. 2, like reference numerals have been used 
to identify like parts, so the following discussion will
focus primarily on the provision that has been made 
in the electronic document printing system 11A for 
printing a human-readable rendering 45 of an electronic
document 32, together with a digital, machine-readable
representation 101 of that same electronic
document 32 on the same record medium 102 through 
the use of the same printer 45 or 50. In accordance 
with this invention, for integrating a digital, machine-readable
representation 101 of the electronic document
32 with the human-readable rendering of it, the
bit-level digital data content of the ASCII, DDL or PDL 
encodings of all or selected portions of the electronic 
document 32 is encoded at 105 to convert it into
"glyph encodings" (encodings representing distinctive
markings having at least two distinguishable, machine-readable
states, viz., a true ("1") state and a
false ("0") state). These glyph encodings are then
merged into the electronic document description file 
for the electronic document 32 to cause the glyphs to 
be printed on the hardcopy output document 102 at 
one or more selected locations. 
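
The conversion of bit-level data into two-state glyph encodings can be sketched as follows. This is an illustration only: the use of a forward slash for the true ("1") state and a backslash for the false ("0") state, the most-significant-bit-first ordering, and the function names to_glyphs and from_glyphs are all assumptions made for the example rather than features stated in this disclosure.

```python
# Assumed glyph shapes: "/" for a true ("1") bit, "\" for a false ("0") bit.
GLYPH_FOR_BIT = {0: "\\", 1: "/"}
BIT_FOR_GLYPH = {glyph: bit for bit, glyph in GLYPH_FOR_BIT.items()}


def to_glyphs(data):
    """Encode each byte as eight glyphs, most significant bit first."""
    return "".join(
        GLYPH_FOR_BIT[(byte >> shift) & 1]
        for byte in data
        for shift in range(7, -1, -1)
    )


def from_glyphs(glyphs):
    """Decode a glyph string produced by to_glyphs back into bytes."""
    if len(glyphs) % 8:
        raise ValueError("glyph count must be a multiple of 8")
    out = bytearray()
    for start in range(0, len(glyphs), 8):
        byte = 0
        for glyph in glyphs[start:start + 8]:
            byte = (byte << 1) | BIT_FOR_GLYPH[glyph]
        out.append(byte)
    return bytes(out)


encoded = to_glyphs(b"A")  # ASCII "A" is 01000001
print(encoded)
print(from_glyphs(to_glyphs(b"Fig. 2")))
```

The round trip through to_glyphs and from_glyphs returns the original bytes, which is the property that allows the printed glyphs to stand in for the digital encoding from which they were produced.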

As will be appreciated, the printed glyphs may 
take various forms. For example, they may be binary 
bar codes composed of black and white markings 
which, by their presence, absence, or spacing represent
the true ("1") and false ("0") states of the data
bits. Alternatively, they may be markings which pro- 
vide two or more levels of machine-readable discrimi- 
nation by virtue of their shape, rotation, density or 
similar attributes. The glyphs may be machine-read- 
able by means of human-invisible characteristics of 
the print materials, such as their infrared reflectivity, 
their high resolution spectral detail, their metameric 
spectral characteristics, or their magnetization. These 
machine-detectable materials may be incorporated 
into the same printing process that is employed for 
printing the human-readable rendering, such as by 
utilizing xerographic toners which have machine-recognizable,
human-invisible characteristics,
together with their usual visible characteristics of 
color, whiteness, blackness, transparency and 
opacity. 

Furthermore, the glyphs may be printed at vari- 
ous locations on the hardcopy document 102. For 
instance, one or more fields may be set aside in the 
top, bottom, right-hand or left-hand margins of the 
document 102 for the printing of such glyphs. Alternatively,
as shown in Figs. 3 and 4, the glyphs may be
printed in machine-identifiable glyph frames which 
are fully or partially confined within the margins of the 
human-readable field of the document 102, or fully 
outside those margins. Glyph frames may be distinguished
from any human-readable information with
which they are intermixed, such as by causing the
printer 45 or 50 to mark their boundaries with a distinctive,
machine-recognizable border pattern as at 111
in Fig. 3, or by printing each line of glyphs between
machine-recognizable "start" codes and "end" codes
as at 112 and 113, respectively, in Fig. 4. Still another
option is to print the glyphs in a predetermined region 
on the document 102 using a machine-recognizable
attribute of the printing process or of the glyph pattern 
to distinguish the glyphs from human-readable infor- 
mation that is printed within the same region of the 
document. For instance, the glyph patterns may be 
machine-distinguishable by the shape and periodic 
placement of the glyphs. Moreover, patterns of fine
scale glyphs may be organized to create human-read- 
able markings on a coarser scale, such as text, logos, 
decorative frames, and background settings. 
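
The option of printing each line of glyphs between "start" and "end" codes may be pictured with the following Python sketch. The particular bit patterns chosen for the codes, the representation of a scanned line as a character string, and the names frame_line and extract_line are all hypothetical; the sketch further assumes that neither code pattern occurs within the payload itself.

```python
# Hypothetical bit patterns standing in for the "start" code (112) and the
# "end" code (113); the source does not specify their form.
START_CODE = "11110000"
END_CODE = "00001111"


def frame_line(payload_bits: str) -> str:
    """Wrap one line of glyph bits between the start and end codes."""
    return START_CODE + payload_bits + END_CODE


def extract_line(scanned: str):
    """Return the bits between the first start code and the next end code.

    Returns None when either code is missing from the scanned line.
    """
    start = scanned.find(START_CODE)
    if start < 0:
        return None
    begin = start + len(START_CODE)
    end = scanned.find(END_CODE, begin)
    if end < 0:
        return None
    return scanned[begin:end]
```

In this sketch any marks outside the two codes, such as intermixed human-readable text, are simply ignored by extract_line.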

In keeping with this invention, all or only selected 
portions of the ASCII, DDL or PDL digital encodings 
of the electronic document 32 may be printed on the 
hardcopy document 102. Furthermore, the digital rep- 
resentation of the electronic document that is printed 
on the hardcopy 102 may be utilized in lieu of or to 
augment the recognition software 63 (Fig. 1) for 
uploading an editable copy of the electronic document 
32 into the document processing system 11 A. More 
particularly, if all of the digital data contained by the 
electronic document 32 are printed in digital data form 
on the hardcopy 102, the electronic document 32 can
be recovered merely by employing the input scanner
12 for scanning the glyph-encoded data to recover the
data that affects the appearance of the document, as 
at 121, as well as the data that are not inferrable from 
the appearance of the document, as at 122. For 
example, the appearance-related data that may be 
recovered at 121 include the ASCII text bits as at 123, 
the font style bits as at 124, and the page layout data 
as at 124, all of which may be read out directly from the
glyph-encoded data for application to the workstation
13, without any intermediate processing. Appearance-related
image data (i.e., bit maps) also can be
recovered from glyph-encoded data embedded in the 
hardcopy document 102, but such image data are 
stored in an image frame, as at 125, for application to 
the workstation 13 in order to allow optimal uptake of
the image frame (for example, information about the 
screen frequency and rotation of halftoned images 
can aid in their conversion for subsequent reformat- 
ting, displaying and printing, avoiding the degradation 
in image quality normally occurring in electronic re- 
screening). 

In short, this invention provides a less error-prone
alternative to employing conventional recognition
techniques for recovering the digital data defining the
recognizable features of the human-readable,
hardcopy rendering of the electronic source docu- 
ment 32. However, even if conventional recognition 
techniques are employed for recovering the digital 
data defining some or all of those features, it will be 
evident that this invention enables data which are 
potentially important to the accuracy and/or complete- 
ness of the reconstruction of the electronic source 
document 32 to be recovered, even if such data are 
not evident or inferrable from the appearance of the 
human-readable rendering of the source document. 
For example, the glyph-encoded data that are embed- 
ded in the hardcopy document 102 may include one 
or more of the following: machine-readable descrip- 
tions of the data points for structured graphics as at 
131, machine-readable descriptions of the algorithms 
utilized for performing computations for spreadsheets
and the like as at 132, machine-readable descriptions
of hypertext pointer values as at 133, machine-readable
descriptions of some or all of the structural
characteristics of the electronic source document as
at 134, machine-readable descriptions of the document
editor used to prepare the source document 32,
as at 135, machine-readable descriptions of the file
name and storage location of the electronic source
document 32, as at 136, and machine-readable descriptions
of audit-trail data for the electronic source
document 32, as at 137.
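
Purely as an illustration, some of the non-appearance data listed above could be gathered into a single record and serialized to bytes before glyph encoding, as in the following Python sketch. The record layout, the field names, the use of JSON as the serialization, and the names EmbeddedDescription, to_payload and from_payload are assumptions of this sketch; the disclosure does not prescribe any particular format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class EmbeddedDescription:
    """Hypothetical record of data not inferrable from the rendering."""
    file_name: str                # cf. file name, as at 136
    storage_location: str         # cf. storage location, as at 136
    editor: str                   # cf. document editor, as at 135
    hypertext_pointers: list = field(default_factory=list)  # cf. 133
    audit_trail: list = field(default_factory=list)         # cf. 137


def to_payload(description: EmbeddedDescription) -> bytes:
    """Serialize the record to ASCII bytes ready for glyph encoding."""
    return json.dumps(asdict(description), sort_keys=True).encode("ascii")


def from_payload(payload: bytes) -> EmbeddedDescription:
    """Rebuild the record from bytes recovered by scanning."""
    return EmbeddedDescription(**json.loads(payload.decode("ascii")))
```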

As will be appreciated, the foregoing examples of 
the types of digital data which this invention permits 
to be stored on and recovered from the hardcopy output
of electronic document processing systems are
not exhaustive. For instance, in color systems it may
be desirable to record the color values (typically,
cyan, magenta, yellow and black) digitally for the
pixels of the human-readable hardcopy rendering so
that those values can be reliably recaptured from the
hardcopy. As still another example, it may be desirable
to record data identifying the toner and/or the
fonts employed for printing a xerographic rendering of
an electronic document, to assist a document recognition
system with the interpretation of such a rendering.
In other words, this invention may be utilized for
storing and communicating a machine-readable description
of all or any selected part of the electronic
source document 32, as well as a like description of
the equipment and process employed for producing
the source document 32 and the human-readable rendering
45 of it. Moreover, such digital data descriptions
can be redundantly recorded if desired
(assuming that adequate space is available on the
hardcopy document 102 for such redundant recording),
thereby reducing the risk that critical data will be
lost as a result of the ordinary wear and tear the
hardcopy 102 may experience.
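
One simple way to picture such redundant recording is to print several identical copies of the data and to resolve disagreements among the scanned copies by majority vote, as in the Python sketch below. The voting scheme and the names record_redundantly and recover are assumptions of this sketch; the disclosure states only that the data may be redundantly recorded and does not specify how the copies are reconciled.

```python
from collections import Counter


def record_redundantly(bits: str, copies: int = 3) -> list:
    """Return the same bit string repeated for printing in several places."""
    return [bits] * copies


def recover(scanned_copies: list) -> str:
    """Take a majority vote at each bit position across the scanned copies.

    An odd number of equal-length copies is assumed so that ties cannot occur.
    """
    if len({len(copy) for copy in scanned_copies}) != 1:
        raise ValueError("copies must be non-empty and of equal length")
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*scanned_copies)
    )
```

With three copies, a bit damaged in any one copy is outvoted by the other two.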

Fig. 5 schematically illustrates a few of the work
ways that are enabled by this invention. Collaborating
authors 151 and 152 may exchange document drafts
electronically or in hardcopy form, using ordinary print
facilities 153, input scanning facilities 154 and mass
storage facilities 155. Such documents can be printed
to include digitally-embedded data descriptions and
can be distributed by mail, as at 156, in digitally-augmented
hardcopy document form 157 to an editor
158, where the electronic document can be recaptured
with substantial fidelity by an input scanner 161
for editing on a workstation 162. When the editor 158
is finished with the document (or when an editorial assistant
or typist 159 is finished with it, such as in a
workgroup utilizing a shared processing node 160),
the document may be reprinted by a printer 163 for
further distribution in hardcopy form, as at 164, but it
may now be further augmented, as at 165, with data
describing some or all of the editorial actions that
have been taken. During this or any of the other
phases of the "hardcopy" distribution process, the
document 165 may pass through various "smart"
copying processes, facsimile processes, scanning
processes and printing processes, as at 167, during
which data describing those processes may be added
to it before it is returned to the original author or
authors in hardcopy form, as at 168, to be electronically
recaptured by them through the use of the input
scanner 154.

It will be appreciated that the present invention
provides relatively straightforward and reliable
methods and means for capturing and communicating,
in fully integrated hardcopy form, digital data describing
the structure and content of the electronic
source document underlying a human-readable
hardcopy rendering of the electronic document, as
well as digital data defining the equipment and process
employed to prepare the source document and
to produce the rendering. Furthermore, the types of
digital data which may be captured and communicated
in accordance with this invention may be determined
to satisfy the requirements of various
applications and operating environments and may
vary significantly from case to case.


Claims 

1. An electronic document processing system (11)
having an input scanner (12) for scanning and
electronically capturing information carried by
hardcopy documents, a digital computer (14) having
one inlet coupled to the input scanner and
another inlet coupled to a user interface for enabling
users to create, edit and manipulate electronic
data files, and a digital printer (15) coupled to the
computer for printing human-readable, hardcopy
renderings of selected electronic data files; comprising

means for embedding digital data pertaining
to the hardcopy renderings in the renderings,
the data being selected to aid such an electronic
document processing system in interpreting such
renderings when the renderings are scanned into
such an electronic document processing system.

2. The system as claimed in Claim 1, wherein the
data embedded in the renderings digitally define
at least selected portions of the human-readable
information displayed by the renderings.

3. The system as claimed in Claim 1 or 2, wherein
the data embedded in the renderings digitally
define at least one attribute of the electronic document
files from which the renderings are printed,
such attribute not being displayed by such rendering.



4. The system of Claim 3, wherein the attribute 
relates to at least one structural characteristic of 
the electronic document files. 

5. The system of Claim 4, wherein the electronic 
document files are encoded in accordance with 
predetermined structural encodings, and said 
embedded data define such encodings. 

6. The system of Claim 3 or any claim dependent 
therefrom, wherein the attribute relates to at least 
one operation performed by the electronic pro- 
cessing system for producing the electronic docu- 
ment files. 

7. The system of Claim 3 or any claim dependent 
therefrom, wherein the attribute relates to at least 
one operation performed by the electronic pro- 
cessing system for processing scanned-in repre- 
sentations of the electronic document files. 

8. The system of any preceding Claim, wherein the 
data embedded in the renderings digitally define
at least one attribute of the printer which produces 
the renderings. 

9. The system of any preceding Claim, wherein the 
renderings are composed of at least two colors, 
and 

the data embedded in the renderings 
quantitatively define compositional values of 
each color for at least selected portions of the renderings.

10. The system of any preceding Claim, wherein

at least some of the data embedded in the render- 
ings are compressed in accordance with a pre- 
determined compression algorithm, and 

another portion of the embedded data 
specifies a decompression algorithm for decom- 
pressing the compressed data. 
ABSTRACT:

Provision is made in electronic
document processing systems for printing unfiltered or filtered machine- 
readable digital representations of electronic documents, and human- 
readable renderings of them on the same record medium using the same 
printing process. The integration of machine-readable digital representations 
of electronic documents with the human-readable hardcopy renderings of 
them may be employed, for example, not only to enhance the precision with 
which the structure and content of such electronic documents can be 
recovered by scanning such hardcopies into electronic document processing 
systems, but also as a mechanism for enabling recipients of scanned-in 
versions of such documents to identify and process annotations that were 
added to the hardcopies after they were printed and/or for alerting the 
recipients of the scanned-in documents to alterations that may have been 
made to the original human-readable content of the hardcopy renderings. In 
addition to storage of the electronic representation of the document, 
provision is made for encoding information about the electronic 
representation of the document itself, such as file name, creation and 
modification dates, access and security information, printing histories. 
Provision is also made for encoding information which is computed from 
the content of the document and other information, for purposes of 
authentication and verification of document integrity. Provision is also made 
for the encoding of information which relates to operations which are to be 
performed depending on handwritten marks made upon a hardcopy 
rendering of the document; for example, encoding instructions of what 
action is to be taken when a box on a document is checked. Provision is also 
made for encoding in the hardcopy another class of information: information 
about the rendering of the document specific to that hard copy, which can 
include a numbered copy of that print, the identification of the machine 
which performed that print, the reproduction characteristics of the printer, 
the screen frequency and rotation used by the printer in rendering halftones. 
Provision is also made for encoding information about the digital encoding 
mechanism itself, such as information given in standard-encoded headers
about subsequently compressed or encrypted digital information.
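
As a hedged illustration of a header that describes subsequently compressed data, the following Python sketch prefixes a payload with a single byte naming the algorithm needed to decompress it. The one-byte header layout, the identifier values, the choice of zlib, and the names pack and unpack are assumptions of this sketch and are not drawn from the disclosure.

```python
import zlib

# Hypothetical one-byte algorithm identifiers, each mapped to a
# (compress, decompress) pair; the source does not define such a table.
ALGORITHMS = {
    0: (lambda data: data, lambda data: data),  # stored uncompressed
    1: (zlib.compress, zlib.decompress),        # DEFLATE via zlib
}


def pack(data: bytes, algorithm_id: int) -> bytes:
    """Compress data and prefix it with a header byte naming the algorithm."""
    compress, _ = ALGORITHMS[algorithm_id]
    return bytes([algorithm_id]) + compress(data)


def unpack(blob: bytes) -> bytes:
    """Read the header byte, then decompress the remainder accordingly."""
    if not blob:
        raise ValueError("empty payload")
    _, decompress = ALGORITHMS[blob[0]]
    return decompress(blob[1:])
```

Because the header itself is left uncompressed, a reader can determine how to treat the remainder of the data before attempting to decode it.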